Metadata in International Database Systems
and the United Nations Common Database
(UNCDB)

Introduction

The on-going changes in technology have provided national and international
statisticians with many new opportunities in providing statistical data and
information to users. This paper provides an overview of current practices of
providing metadata in international data systems and focuses particularly on
systems that are designed to provide data via Internet. This paper discusses
the essential elements of a statistical metadata system. The metadata system
of the United Nations Economic and Social Information System Common Database
is discussed in detail.

I.  Metadata  Objectives 

The recent release by the Data Documentation Initiative
(DDI) Committee of version 1 of the DDI Document Type
Definition (DTD) has once again brought attention to the
importance of metadata for social and behavioural data-sets.
The recent changes in technology have produced
many new opportunities to preserve the integrity of the
data-sets via detailed metadata (see
http://www.icpsr.umich.edu/DDI).

The first issue/objective in metadata should be to determine
the unit of analysis. In this context we would consider it a
data-set. Too often we see vague references to data-sets.
The description of a data-set should be standardised, with full
citation information provided. Without such basic citation
information on the data-set, there is very little value in
having detailed metadata.

This need was recognised by the DDI group when, in the
introduction to their work, they stated:

"... the (DDI) is an effort to establish an international
criterion and methodology for the content, presentation,
transport, and preservation of "metadata" about data-sets in
the social and behavioural sciences. Metadata (data about
data) constitute the information that enables the effective,
efficient, and accurate use of those data-sets..."

Most international statistics agencies have also been
actively developing metadata systems in conjunction with
their databases that are available on Internet. In the
development of these metadata systems, the statistics
offices have had to consider their objectives, which include
being able to provide metadata to both internal and external
users. Following is a brief summary of the type of objectives
that three international statistics agencies have adopted in
the development of statistical metadata systems.


The Statistical Office of the European Communities
(EUROSTAT) considers that users of its databases need:

•  "assistance in the search for data, to find out which
data are actually available and how they can be
retrieved (data must be accessible);

•  help to understand meaning and limitations in the
use of the data in detail; they need element for proper
interpretation and a quality assessment of the data
(data must be documented);

•  help to assess the reliability and the quality of the
data in detail; they need to know methodological
aspects concerning the data, along the stages of the
statistical life cycle (data must be usable)" (see
EUROSTAT, 2000).

In addition to the above, EUROSTAT has its own needs for
metadata, which include being able to "collect, produce,
maintain and disseminate harmonised, documented and
high-quality statistical data".

The OECD considers that a distinction needs to be made
between the metadata collected and compiled by
international agencies and the metadata actually made
available to users. It regards this as a judgement that
statisticians should make, and one that usually depends on the
method of dissemination employed. The general objective
is to provide enough detailed metadata for users to fully
understand and interpret the statistical data (see
OECD, 1999).

In the development of the United Nations Economic and
Social Information System Common Database (UNESIS-CDB),
metadata standards and guidelines were adopted.
These provide for particular priority being "given to
achieving consistency and transparency of statistical
methods across countries and among national and
international sources through the following metadata
objectives:

•  Sound citation and bibliographic practices;

•  Replicability and verifiability of data and estimates
for research;

•  Open access to data and metadata using Internet;

•  Integration of metadata and data in electronic
databases;

•  Achievement of inter-database consistency in
concepts, classifications and definitions;

•  Documenting national and international
differences in concepts, classifications and definitions
used relative to international standards and best
practices." (United Nations, 1999).

II. "Core" Sets of Metadata Proposed by Various
Organisations

The work on the DDI (see http://www.icpsr.umich.edu/DDI)
highlights two approaches to metadata that are
currently employed, one more extensive than the other. For
convenience I will describe them as the library and research
methods.

The  Library  metadata  method: 

The library method is based on a bibliographic catalogue
entry. With the introduction of computer data files,
detailed rules were developed for cataloguing computer
files (see Dodd, 1982 and Gorman and Winkler, 1988).
Further work in this area was undertaken during the 1990s
by the Data Documentation Initiative committee.

The Data Documentation Initiative committee produced
version 1 of the Document Type Definition (DTD) for
social science data in March 2000. The DTD provides the
"markup" for use in Extensible Markup Language (XML).
The highest level components of the DTD are:

•  Document Description - items describing the
marked-up document itself as well as its source
documents (citation, title, etc.);

•  Study Description - items describing the overall
data collection (title, citation, methodology, study
scope, data access, etc.);

•  Data Files Description - items relating to the
format, size, and structure of the data files;

•  Variables Description - items relating to variables
in the data collection;

•  Other Study-Related Materials - other study-related
material not included in the other sections
(bibliography, separate questionnaire file, etc.)

The complete DTD version 1 is available at
<http://www.icpsr.umich.edu/DDI>
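
As a rough illustration of the five top-level components listed above, the following Python sketch builds an empty XML skeleton using only the standard library. The tag names (codeBook, docDscr, stdyDscr, fileDscr, dataDscr, otherMat) are not given in this paper; they are assumed here from common DDI codebook usage and should be checked against the DTD itself before being relied upon.

```python
import xml.etree.ElementTree as ET

# Assumed tag names for the five top-level components; verify against the DTD.
DDI_SECTIONS = [
    ("docDscr", "Document Description"),
    ("stdyDscr", "Study Description"),
    ("fileDscr", "Data Files Description"),
    ("dataDscr", "Variables Description"),
    ("otherMat", "Other Study-Related Materials"),
]


def build_skeleton():
    """Return a root element with one empty child per top-level component."""
    root = ET.Element("codeBook")  # assumed root element name
    for tag, label in DDI_SECTIONS:
        child = ET.SubElement(root, tag)
        # Record the human-readable component name as an XML comment.
        child.append(ET.Comment(label))
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

A real instance would of course fill each section with the items named in the list (citation, title, methodology and so on); the skeleton only shows how the five components sit side by side under one document.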

The United Nations system has developed a Core Set of
Metadata for Use by UN-System Organisations (see
www.unsystem.org/acc/statements/doc-manage/core-metadata.html).
This system is designed for documents
and provides guidelines for dealing with:

•Handle metadata;
•Context metadata;
•Content metadata;
•Access and use metadata;
•Structural metadata.

The  Research  method: 

United  Nations  statistical  reports 

If we consider what statistical metadata could be included
for a sample survey, we end up with a very detailed list of
possibilities. If we follow the United Nations
Recommendations for the Preparation of Sample Survey
Reports (United Nations 1964) we would have very
detailed metadata, which would include the following:

General Report:

•Statement of purposes of the survey;
•Description of the coverage;
•Collection of information;
•Repetition;
•Numerical results;
•Date and duration;
•Accuracy;
•Cost;
•Assessment;
•Responsibility;
•References.

Technical Report:

•Specification of the frame;
•Design of the survey;
•Personnel and equipment;
•Statistical analysis and computational procedure;
•Accuracy of the survey;
•Accuracy, completeness and adequacy of the frame;
•Comparisons with other sources of information;
•Costing analysis;
•Efficiency;
•Observations of technicians.

To include sufficient detail on all the above is indeed a very
daunting task, but one that is undertaken routinely by
statisticians. As we have moved from the statistical sample
survey reports being published in printed form to now
being published via Internet, the task of the statistician to
keep the above statistical metadata attached to the statistics
is also daunting.

International  Labour  Organization  (ILO) 

The International Labour Organization, Bureau of Statistics,
provides extensive metadata in LABORSTA, the Bureau's
principal database. The LABORSTA database consists
mainly of annual time-series which serve to publish the
ILO Yearbook of Labour Statistics. Metadata in
LABORSTA is provided at the general and country-specific
levels. For example, the LABORSTA database has detailed
metadata for the statistical series Unemployment. The
following extract, which has been adapted, gives the item
headings for the metadata included at the general level:

General level

•Classification schemes;
•Intercountry comparisons;
•Source;
•Quality statements;
•Definitions used;
•Period covered.

For specific detail on metadata included for unemployment,
see <http://laborsta.ilo.org/appl/data/c3e.html>

For each country there is specific metadata related to each
statistical series. For example, in the case of
unemployment statistics, metadata on the following country
survey information is provided:

Country survey level

•Title of the survey;
•Organisation responsible for the survey;
•Coverage of the survey;
•Periodicity of the survey;
•Reference period;
•Topics covered by the survey;
•Concepts and definitions;
•Classifications used;
•Sample size and design;
•Field work;
•Quality controls;
•Weighting the sample;
•Sampling errors;
•Adjustments;
•Seasonal adjustment;
•Non-sampling errors;
•History of the survey;
•Documentation.

For specific detail on metadata included for
Unemployment for Belgium, see
<http://laborsta.ilo.org/appl/data/ssm3e/BE.html>

These two examples provide an indication of the scope and
depth of statistical metadata that can be captured and
included in a statistical metadata system. Note that
information is provided unique to each country's data-set,
even in the ILO's data-set which includes many countries.
This makes for little standardisation in international
reporting. Other international statistical agencies have
provided or plan to provide varying levels of metadata
depending upon the needs of their users and their own
metadata strategy. A complete listing of international
statistical agencies can be found at
<http://www.un.org/Depts/unsd/gs intstat.htm>.

Methods of providing metadata

At the international level, two main approaches are currently
being used to provide statistical metadata: the repository
method and the agreed standard method.

Repository  method 

The repository method involves the international statistical
agency obtaining all relevant statistical metadata from the
national statistical office and establishing a repository of
that information in its database. In previous times the
statistical metadata was compiled in a statistical
compendium (see United Nations 1977 for an example of
this practice). The ILO and IMF currently use this method
for their statistical metadata. See
<http://laborsta.ilo.org/appl/data/c3e.html> and
<http://laborsta.ilo.org/appl/data/ssm3e/BE.html> for examples
of the statistical metadata held by the ILO for the
unemployment series.

Cross-referencing of statistical metadata amongst
international agencies is starting to develop. For example,
the ILO web site has a pointer to the IMF's web site for
detailed national practices on such items as Consumer Price
Indices (see
http://www.ilo.org/public/english/bureau/stat/guides/cpi/index.htm).

The major limitation of the repository method is its static
nature. To obtain statistical metadata, a questionnaire is
usually sent to the national statistics office, with the
metadata being integrated into the international agency's
intranet site. The other method of obtaining the statistical
metadata is by placing the onus on the national statistical
office to provide the statistical metadata, as is the case with
the IMF's DSSD.

Metadata  coverage  in  repository  systems 

Whilst in theory the Internet is an ideal solution to
providing linkage of metadata to data-sets and the
information necessary for full understanding of the
statistical data, the reality is far from this ideal situation.
The metadata in many cases does not currently exist, and
when it does, the coverage is often limited. There are,
however, exceptions to this: the ILO's LABORSTA
database is a good example of a detailed and
comprehensive statistical metadata repository.

Figure 1.

International  agreed  standards  method 

The second major method involves metadata definitions
that are the internationally agreed standards/definitions in
each field of statistics. Using the internationally agreed
standards/definitions as the benchmark, differences from
these international definitions are highlighted. We aim to
have a one to one consistency between statistical series and
classification terminology and the national and
international standards.

There are a number of advantages in taking this approach to
statistical metadata definitions. The administration of the
statistical metadata is very much simplified since the
editing and updating process is easy to do in a practical
sense. Accessing data is easier as this approach greatly
improves the facilities for searching data. Further down the
development process, this approach will highlight problems
and gaps in international standards. Users will be better
able to follow issues and comparability of data if the
common standard is benchmarked, rather than having to
deduce differences from each data supplier's diverse
literature.


The major limitation to this approach is that whilst national
statistical offices agree at the international level on
standard methods and definitions, in practice it is common
that these are not followed. The data that are provided
come with either a technical description of the methods
and definitions employed or at worst a statement that the
standards have been followed to the extent possible. In the
latter situation this leaves the statistician and subsequent
user with no information on potentially important
differences.
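
The benchmarking idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary layout and the sample definitions are all invented here, and a plain text comparison stands in for what would in practice be a statistician's judgement.

```python
def find_deviations(standard, supplied):
    """Compare supplied definitions with the agreed standard, term by term.

    Both arguments map a term to its definition text. The result contains a
    note only for terms where the supplier departs from the benchmark or
    gives no definition at all.
    """
    notes = {}
    for term, standard_text in standard.items():
        supplied_text = supplied.get(term)
        if supplied_text is None:
            notes[term] = "no definition supplied"
            continue
        # Ignore differences in case and spacing before comparing.
        same = " ".join(supplied_text.lower().split()) == " ".join(standard_text.lower().split())
        if not same:
            notes[term] = "differs from standard: " + supplied_text
    return notes


# Hypothetical definitions, invented for illustration only.
standard = {
    "unemployed": "without work, available for work and seeking work",
    "reference period": "one week",
}
supplied = {
    "unemployed": "registered at an employment office",
    "reference period": "One  week",
}
deviations = find_deviations(standard, supplied)
print(deviations)
```

Note how the limitation discussed above shows up in the sketch: a supplier that provides no definitions at all yields only "no definition supplied" for every term, which tells the user nothing about the actual differences.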

III.  UNESIS  Common  Database  (UNCDB) 

The UNESIS production databases plus a variety of other
sources are used for compilation of the UNESIS Common
Database, which is intended for harmonization and
dissemination of data in a single unified framework rather
than data production in specific, specialized fields. The
Common Database comprises a comprehensive core of data
from the global statistical system (see figure 1). For
detailed information on the UNESIS project and the
UNESIS Common Database see United Nations Economic
and Social Information System (UNESIS)
http://www.un.org/Depts/unsd/forum/forum.htm

Summary of infotype: Life expectancy by sex (UN

UNESIS Common Database Metadata

The UNESIS Common Database has a different approach
to metadata than do most national and international
statistics offices. The UNESIS Common Database metadata
definitions are the internationally agreed definitions in each
field of statistics. Using the internationally agreed
definitions as the benchmark, differences from these
international definitions are highlighted. We aim to have a
one to one consistency between infotype and dimension
terminology and the national and international standards.

There  are  a  number  of  advantages  in  taking  this  approach  to 
statistical  metadata  definitions.  The  administration  of  the 
statistical  metadata  is  very  much  simplified  since  the 
editing  and  updating  process  is  easy  to  do  in  a  practical 
sense.  Accessing  data  is  easier  as  this  approach  greatly 
improves  the  facilities  for  searching  data.  Further  down  the 
development  process,  this  approach  will  highlight  problems 
and  gaps  in  international  standards.  Users  will  be  better 
able  to  follow  issues  and  comparability  of  data  if  the 
common  standard  is  benchmarked,  rather  than  having  to 
deduce  differences  from  each  data  supplier's  diverse 
literature. 

The major limitation to this approach is that whilst national
statistical offices agree at the international level on standard
methods and definitions, in practice it is common that these
are not followed. The data that are provided come with
either a technical description of the methods and definitions
employed or at worst a statement that the standards have
been followed to the extent possible. In the latter situation
this leaves the statistician and subsequent user with no
information on potentially important differences.


Detailed metadata is provided in UNESIS Common
Database for all data infotypes and dimensions. These
include the source and a data dictionary definition of every
term appearing in the infotype and dimensions. An overview of
the UNESIS Common Database metadata schema is shown
in figure 2.

Source  citation 

The UNESIS Common Database standard is to cite printed
or other permanent public sources in order to ensure that
users can find the source indicated at any future date. These
sources are specific as to page/paragraph or table used as
well as providing the standard citation information on place
published, publisher, date and author (or responsible
office).
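
The citation elements named above could be held in a small record such as the following Python sketch. The class and field names are assumptions made for illustration, the title field is added here because a citation needs one although the paragraph does not list it, and the sample values are placeholders rather than a real publication.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SourceCitation:
    author: str     # author or responsible office
    title: str      # assumed field, not listed in the text above
    place: str      # place published
    publisher: str
    date: str
    locator: str    # page, paragraph or table used

    def is_complete(self):
        """True when every citation element has been filled in."""
        return all(getattr(self, f.name).strip() for f in fields(self))

    def format(self):
        """Render the citation as a single line."""
        return (f"{self.author}, {self.title} "
                f"({self.place}: {self.publisher}, {self.date}), {self.locator}.")


# Placeholder values, invented for illustration only.
citation = SourceCitation(
    author="Example Statistical Office",
    title="Example Yearbook",
    place="Example City",
    publisher="Example Press",
    date="1999",
    locator="table 3",
)
print(citation.format())
print(replace(citation, locator="").is_complete())
```

Rejecting records for which is_complete() is false would be one simple way to enforce the rule that every source is specific down to the page, paragraph or table.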

Some other examples of citation guidelines and formats for
national and international data sources are given by the
International Monetary Fund and United Nations Statistics
Division on their Internet sites ("Dissemination Standards
Bulletin Board" and "Special Data Dissemination
Standards", http://dsbb.imf.org).

Data  Dictionary 

The data dictionary in UNESIS Common Database is used
to promote integration and consistency among databases
and a common language among data producers and users.
The data dictionary provides definitions and references to
all terms that are found in a statistical table. The United
Nations Statistics Division has developed such dictionaries
for its print publication the United Nations World Statistics
in Brief and its Internet publication Monthly Bulletin of
Statistics On-line <www.un.org/Depts/unsd/mbs.html>
and as part of the metadata system of the UNESIS
Common Database.

Definitions

The procedure followed in preparing the data dictionary has
been as follows:

•  Only terms appearing in column heads and rows of
actual published print or electronic statistical tables are
included but all such terms must be defined in some
way. The language used in table titles tends to be more
descriptive and less precise, so terms found there are
not included;

•  Definitions are given with strict adherence to the
language of the original international recommendation
or some other accepted reference if there is no directly
applicable international recommendation, but with
some allowance for intelligibility by a non-technical
audience. Definitions given by the international
organizations by way of explanation in technical notes
to their various publications are not used as they have
not been through the same inter-governmental review
process as the original recommendations.

References

The definition references are specific as to page/paragraph
or table used as well as providing the standard citation
information on place published, publisher, date and author
(or responsible office).
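
The two rules above (every term in the column heads and rows must be defined, and every definition carries a reference) lend themselves to a simple check. The following Python sketch is hypothetical: the function names, the dictionary layout and the sample entries are assumptions for illustration, not part of the UNESIS system.

```python
def undefined_terms(column_heads, row_labels, dictionary):
    """Terms used in column heads or rows that have no dictionary entry.

    Table titles are deliberately not examined, following the procedure above.
    """
    terms = set(column_heads) | set(row_labels)
    return sorted(term for term in terms if term not in dictionary)


def entries_without_reference(dictionary):
    """Dictionary entries whose reference has not been filled in."""
    return sorted(term for term, entry in dictionary.items()
                  if not entry.get("reference"))


# Placeholder entries, invented for illustration only.
dictionary = {
    "Life expectancy at birth": {"definition": "(definition text)",
                                 "reference": "(recommendation, paragraph)"},
    "Male": {"definition": "(definition text)", "reference": ""},
}
column_heads = ["Male", "Female"]
row_labels = ["Life expectancy at birth"]

missing = undefined_terms(column_heads, row_labels, dictionary)
unreferenced = entries_without_reference(dictionary)
print(missing, unreferenced)
```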

Topics 

All infotypes (series) and dimensions are classified by
topic. The topics list is based on the standard ACC
Subcommittee on Statistical Activities classification of
statistical programmes (see annex 4). This classification
provides a useful thematic framework of infotypes and
dimensions for users.


•A topic can be associated with many infotypes;
•A data dictionary definition can be associated with many infotypes;
•A data dictionary definition can be associated with many dimensions;
•A data dictionary definition can be associated with only one reference.


Data  footnotes 

Data footnotes in the UNESIS Common Database are primarily
used to indicate deviations of the data from the international
standard definitions. Data footnotes can be associated with
cells only; this is in order to avoid the footnote being
detached if selectively accessed. When data are imported to the
UNESIS Common Database and the infotype has a footnote,
the footnote is carried onto all the cells under the infotype.
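
The import rule for footnotes can be illustrated with a short Python sketch. The function names, the cell layout (a dictionary keyed by country and year) and the sample figures are all assumptions made for this example; the paper does not describe how the database stores cells.

```python
def import_series(values, infotype_footnote=None):
    """Attach an infotype-level footnote to every cell on import."""
    cells = {}
    for key, value in values.items():
        # Each cell gets its own list so later edits do not leak across cells.
        footnotes = [infotype_footnote] if infotype_footnote else []
        cells[key] = {"value": value, "footnotes": footnotes}
    return cells


def select(cells, keys):
    """Selective access: return only the requested cells, footnotes included."""
    return {key: cells[key] for key in keys if key in cells}


# Hypothetical figures and footnote text, invented for illustration only.
values = {("Country A", 1998): 71.2, ("Country A", 1999): 71.5}
cells = import_series(values, "Definition differs from the international standard.")
subset = select(cells, [("Country A", 1999)])
print(subset)
```

Because the footnote travels with each cell, a user who retrieves a single cell still sees the deviation note, which is the stated reason for attaching footnotes at cell level.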

Relationships between UNESIS Common Database data and metadata

The relationships between the UNESIS Common Database
data items (infotypes, dimensions and elements) and
the metadata (see figure 2) can be summarized as follows:

•An infotype has to be associated with only one source;
•An infotype can be associated with one or more topics;
•An infotype can be associated with one or more data dictionary definitions;
•A dimension can be associated with one or more data dictionary definitions;
•A source can be associated with many infotypes;
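
One way to read the relationships listed in this paper is as a small set of record types. The Python sketch below is an interpretation, not the actual UNESIS schema: the class names, field names and the reading of "one or more" as a minimum of one are assumptions, and the citation strings are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Reference:
    citation: str


@dataclass(frozen=True)
class Definition:
    term: str
    text: str
    reference: Reference           # only one reference per definition


@dataclass(frozen=True)
class Source:
    citation: str


@dataclass
class Dimension:
    name: str
    definitions: List[Definition] = field(default_factory=list)  # one or more


@dataclass
class Infotype:
    name: str
    source: Source                                               # only one source
    topics: List[str] = field(default_factory=list)              # one or more
    definitions: List[Definition] = field(default_factory=list)  # one or more


def problems(infotype):
    """List the 'one or more' rules that an infotype does not yet satisfy."""
    found = []
    if not infotype.topics:
        found.append("no topic")
    if not infotype.definitions:
        found.append("no data dictionary definition")
    return found


def infotypes_for(source, infotypes):
    """A source can be associated with many infotypes."""
    return [i.name for i in infotypes if i.source == source]


# Placeholder records, invented for illustration only.
src = Source("(source citation)")
definition = Definition("Life expectancy", "(definition text)", Reference("(reference citation)"))
complete = Infotype("Life expectancy by sex", src,
                    ["Population composition and change"], [definition])
draft = Infotype("Draft series", src)
print(problems(complete), problems(draft), infotypes_for(src, [complete, draft]))
```

Holding the source and the reference as single fields, and topics and definitions as lists, mirrors the one-to-many pattern in the list above; a relational database would express the same thing with foreign keys and link tables.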


Figure 2. Infotype and Metadata Relationships in the UN
Common Database

Metadata and collaboration with the regional
commissions

The statistics divisions of the United Nations regional
commissions are uniquely placed to ensure close,
continuous communication with national statistical services
on countries' most recent data availability and on details of
national sources, methods and definitions. The UNESIS
project team is collaborating with the commissions to take
advantage of this regional experience. The objectives of this
regional collaboration are to ensure commonality of data to
the greatest extent possible at regional and global levels,
much quicker updating and availability, and minimal
demands on national services for international data
compilation. The metadata included in the UNESIS
Common Database provide a key for international data
sharing and exchange with the regional commissions in a
common format.

Annex 1.

UNESIS Common Database Sources

Carbon Dioxide Information Analysis Center
Food and Agriculture Organization of UN - Database (FAOSTAT)
ILO - Estimates and Projections of Economically Active Population
ILO - Labour Statistics Yearbook Database
IMF - International Financial Statistics
International Civil Aviation Organization (ICAO) Database
International Telecommunications Union (ITU)
OECD - Development Assistance Database
UN - Operational Activities for Development Database
UN Population Division - Population Estimates and Projections
UN Statistics Division - Commodity Trade Statistics Database (COMTRADE)
UN Statistics Division - Demographic Yearbook Database
UN Statistics Division - Energy Statistics Database
UN Statistics Division - Industrial Commodities Production Database
UN Statistics Division - Industrial Production Index Numbers
UN Statistics Division - International Trade Statistics Aggregates
UN Statistics Division - National Accounts Estimates of Statistics Division
UN Statistics Division - National Accounts Yearbook Database
UN Statistics Division - Producer Price Index Numbers
UN Statistics Division - Transport Statistics Database
UNESCO - Statistical Yearbook
World Bank - Development Indicators and Finance
World Health Organization and UN Health Indicators
World Intellectual Property Organization - Statistics Database
World Tourism Organization - Statistics Database
UN Population Division - Estimates on Urban/rural Population*
UN Development Policy Analysis Division*
UN Economic Commission for Africa*
UN Economic Commission for Europe*
UN Economic Commission for Latin America and Caribbean*
UN Economic and Social Commission for Asia and the Pacific*
UN Economic and Social Commission for Western Asia*

* To be added.

Annex 4.

UNESIS Common Database Topics list (from ACC
Subcommittee on Statistical Activities)

Agriculture, forestry and fishing
Balance of payments
Communication and culture
Construction
Development assistance
Economically active/not economically active population
Education and learning
Energy
Environment
Financial statistics
Health, health services; impairment, disabilities; nutrition
Households and families, marital status, fertility
Human settlements, housing, geographic distribution of population
Income, consumption and wealth
Industrial production
International finance
International tourism
International trade
Mining and quarrying
National accounts
Population composition and change
Prices
Science and technology, intellectual property
Social security and welfare services
Socio-economic groups and social mobility
Time use
Transport
Women and men
Other social fields
Public order and safety

Authenticity as a Requirement of Preserving
Digital Data and Records

Abstract

Assuring continued authenticity is an essential but intransigent preservation
consideration for digital data and records. Several key issues need to be
addressed: which intellectual and technical elements of data and records are
essential for assuring authenticity; how should these be maintained and
represented over time; and how are authentic data and records used in various
systems of practice? The authors will address these questions in light of case
studies and interviews being conducted with government agencies, academic
institutions, and various organizations in America, Canada, Europe and Asia by
the InterPARES (International Research on Permanent Authentic Records in
Electronic Systems) Project. This article will also discuss initial project
findings as they relate to the specific characteristics of authenticity in the
preservation of digital data and records.


I.  Introduction 

Why is it important to know that preserved digital data and
records are authentic? How do we define authenticity?
How do we know that received digital data and records are
authentic? How are we assured that the digital data and
records are as authentic when we retrieve them as they were
when they were first stored and preserved?

These questions are large in scope. Our presentation
explores the notion of the significance of authenticity in the
management of records and data and reports upon the work
of the InterPARES project currently underway.

The  records  generated  by  society,  whether  in  the  course  of 
government,  business  or  pri\  ate  activity,  need  to  be 
maintained  and  preserved  as  a  mechanism  for 
accountability;  as  evidence  of  indi\  idual  and  corporate 
rights;  and  as  a  form  of  long-term  memorv .  In  the  paper 
world,  documentary  forms  and  procedures  have  developed 
o\ er  time  to  ensure  that  records  are  capable  of  serving  as 
evidence  of  acti\ity  -  to  be  so.  the  records  must  be  both 
reliable  and  authentic.  Reliabiliry  can  be  defined  as  the 
trustworthiness  of  the  content  of  the  record,  which  is 
ascertainable  through  an  examination  of  the  completeness 
of  the  record  and  of  the  procedures  exercising  control  over 
its  creation' .    Charles  M.  Dollar  states  that  "Archival 
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science  defines  authentic  records  as  being 
what  they  purport  to  be  —  reUable 
records  that  over  time  ha\e  not  been 
altered,  changed  or  otherwise 
corrupted."""'  Authenticity  guarantees  that 
the  record  is  not  changed  or  manipulated 
after  it  has  been  created  or  received  or 
■■■^■^^^     migrated  over  the  w  hole  continuum  of 

records  creation,  maintenance  and 
preservadon" .  In  the  context  of  records  as  legal  evidence, 
authenticity  is  an  absolute  concept  in  that  it  either  exists  or 
does  not.  There  is  no  relative  degree  of  authenticity,  while 
there  may  be  for  reliabiUty.  The  status  of  being  authentic, 
however,  can  change  at  any  moment  as  a  result  of  residual 
effects  of  an  action  or  migration  that  has  been  performed  on 
the  record  over  time.  This  is  the  case  for  digital  data  as 
well.  By  contrast,  authentication  is  the  process  of 
guaranteeing the authenticity of a record. If authenticity is
the  status  of  being  authentic,  then  authentication  is  the 
action  or  set  of  activities  that  demonstrate  that  something  is 
authentic. 

When  it  is  created,  a  record  has  two  indispensable 
components:  its  content  and  the  medium  to  which  that 
content  is  affixed.  With  traditional  paper  records,  the 
content  of  a  record  could  not  be  separated  from  its  medium. 
In  the  case  of  an  electronic  record,  however,  its  content  can 
be  separated  from  the  original  medium  and  transferred  to 
another  medium  or  even  to  multiple  other  media.  Even 
maintaining  the  same  type  of  medium,  an  electronic  record 
can  be  migrated  to  another  hardware  and  software 
environment,  thus  effectively  breaking  the  bond  between 
content  and  medium.  Due  to  the  physical  separation  of  the 
content  from  the  media,  as  well  as  the  various  ways  in 
which the integrity of the record's content can be
compromised during the migration processes, the
authenticity of the record is vulnerable. To address and
overcome this vulnerability, increasing emphasis is being
placed in many communities on the development and
implementation  of  authentication  processes  to  ensure  and 
demonstrate  the  authenticity  of  the  record.  Authentication 
processes  have  always  included  both  methodological  and 
procedural  techniques  for  assuring  authenticity,  although 
with traditional records, these techniques have tended to be
more  implicit  than  explicit,  for  example,  through 
demonstration  of  an  unbroken  chain  of  custody  for  a  record 
and through archival description. There has been
increasing concern, therefore, about understanding (i.e.,
identifying  and  defining)  the  quality  and  processes 
associated  with  authenticity  and  authentication  of 
information  objects  within  the  digital  environment. 
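
One narrow, technical piece of the authentication processes described above can be illustrated with a content fingerprint. The sketch below is not drawn from the InterPARES project; it assumes a simple scheme in which a SHA-256 digest is recorded when a record is first stored and is recomputed after the content has been moved to another medium. The function names and the sample record are hypothetical. A matching digest shows only that the bit-level content is unchanged; it does not by itself establish authenticity in the archival sense, which also depends on custody and context.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of the record's content (hypothetical helper)."""
    return hashlib.sha256(content).hexdigest()


def content_unchanged(recorded_digest: str, retrieved: bytes) -> bool:
    """Compare the digest recorded at storage time with the retrieved content."""
    return fingerprint(retrieved) == recorded_digest


# Illustrative record content; the text is invented for this example.
original = b"Minutes of the board meeting, 12 March 1999."
digest_at_ingest = fingerprint(original)

# A faithful copy on a new medium carries the same bytes.
migrated_copy = bytes(original)

# A copy altered during or after migration does not.
altered_copy = original.replace(b"12 March", b"13 March")

print(content_unchanged(digest_at_ingest, migrated_copy))
print(content_unchanged(digest_at_ingest, altered_copy))
```

Because the digest depends only on the content and not on the medium, the same check applies however many times the record is copied or transferred.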

II.  Authenticity  and  the  InterPARES  Project 

In  recent  years,  along  with  the  rapid  growth  of  electronic 
communications  and  information  systems,  digital  records 
and  data  have  presented  new  challenges  and  opportunities 
to  a  variety  of  communities  of  records.  For  example,  the 
legal  community  is  concerned  that  digital  records  are 
legally reliable as evidence. Although attorneys on either
side  of  a  case  may  interpret  the  record  differently,  the 
records  themselves  must  somehow  be  demonstrated  as 
being as authentic when we retrieve them as they were
when they were first stored and preserved. In healthcare, it
is critical that digitally stored x-rays, retrieved perhaps in
connection with a court case or to evaluate a treatment
decision, are identical in resolution and color when we
retrieve them as when they were stored and preserved. In
computer  network  communication  systems,  it  is  important 
to  establish  the  security  of  a  transmission,  a  message,  a 
station,  or  an  originator,  by  ensuring  that  the  sender 
transmits  a  message  only  to  an  intended  receiver  and  that 
the message has not been altered en route. To the archival
community, the significance of the description,
identification and preservation of digital materials is
increased as a result of the evidence-based approach to the
management  of  records. 

These  needs  and  concerns  raise  several  research  questions 
concerning  the  establishment  of  the  authenticity  of  digital 
records  and  data: 

Which  intellectual  and  technical  elements  of  data 
and  records  are  essential  for  ensuring  authenticity  in 
different  communities  of  practice? 

Can the requirements for ensuring the
authenticity  of  data  and  records  be  applicable  across 
jurisdictional  and  technological  boundaries? 

How  should  authentic  data  and  records  be 
maintained  over  time? 

By  identifying  the  requirements  for  ensuring  authenticity, 
the  InterPARES  project  also  hopes  to  answer  these 
questions:  What  is  a  record  and  what  is  data? 

The  overall  focus  of  the  project  is  the  long-term 
preservation of vital organizational records and critical
research  data  created  or  maintained  in  electronic  systems 
and  which  must  be  preserved  permanently  for 
administrative,  legal  or  cultural  reasons.  The  InterPARES 
Project is a collaborative effort among fourteen countries to
develop  strategies,  policies  and  standards  of  authenticity 
and  preservation  of  electronic  records  within  archives. 


Research  is  divided  into  four  interrelated  investigative 
domains: (1) the conceptual requirements for preserving
authentic electronic records; (2) appraisal criteria and
methodology  for  authentic  electronic  records;  (3) 
methodologies  for  preserving  authentic  electronic  records; 
and  (4)  development  of  policies,  strategies  and  standards  to 
ensure  preservation  of  the  authenticity  of  those  records. 
The  goal  of  the  first  research  domain,  which  is  concerned 
with  authenticity,  is  to  identify  the  elements  of  electronic 
records  which  are  necessary  to  maintain  the  authenticity  of 
those  records  over  time  through  an  analysis  of  the  elements 
of  physical  and  intellectual  form  which  may  affect  the 
authenticity  and  nature  of  an  electronic  record.  Task  forces 
in  each  domain  are  using  methodologies  including 
diplomatic  analysis,  structured  interviews,  literature 
reviews,  systems  analysis  and  design,  and  activity  and 
entity modeling. The four task forces focus on
Authenticity, Appraisal, Preservation, and Policy
Development, respectively.

III. Preservation and the InterPARES Project

The  importance  of  determining  and  analyzing  the 
preservation function, institutional needs and long-term
expectations of use and accessibility of electronic records
underlies the research questions driving the InterPARES
Preservation Task Force. The first goal of the Preservation
Task Force is to identify and develop the procedures and
resources required for the implementation of the conceptual
requirements and criteria identified in the first two research
domains. Broadly put, responses to the research questions
will incorporate an examination of the present state of
long-term preservation, either in use or in development;
articulate an understanding of procedural and technical
methods of authentication for preserved electronic records;
provide data about the principles and criteria for media and
storage  management  required  for  preservation  of  authentic 
electronic  records;  and  lastly,  enable  the  development  of  a 
statement  of  responsibilities  for  the  long-term  preservation 
of  authentic  electronic  records. 

The  second  goal  of  the  Preservation  Task  Force  is  to  model 
the  preservation  function  and  implementation,  which  will 
be  based  on  information  gathered  from  responses  to  the 
research  questions.  The  institutional  investigators  working 
at the various national archival institutions will test the models.
Through  an  iterative  process,  results  will  be  brought  back 
to  the  International  Team  and  will  be  used  to  further  refine 
the  models,  which  will  then  be  re-tested.  This  process  is 
expected  to  reveal  certain  basic  principles  upon  which 
strategies,  policies  and  standards  for  the  preservation  of 
authentic  electronic  records  can  be  drafted. 

IV.  Method  and  Findings  to  Date 

The  project  uses  case  studies  to  analyze  requirements  for 
authenticity  based  on  an  analysis  of  features  of  records  and 
their genesis, using a research methodology derived from
diplomatics. Diplomatics is an analytical
method developed in Europe in the seventeenth and
eighteenth centuries to determine the authenticity and
reliability of historical documents. In the process of its
introduction into most European countries, diplomatics
grew into a very sophisticated system of ideas and methods
about the nature of records, their creation and their
relationships with the actions and persons connected to
them and with their organizational, social, and legal
context. The concepts and principles of contemporary
diplomatics have been applied in ongoing electronic
record-keeping research, including InterPARES, and have proven
effective in identifying technical and procedural
requirements for ensuring the reliability and authenticity of
electronic records.

To build a typology of the conceptual requirements for
authenticity for different types of electronic records, the
InterPARES project has developed two instruments: a Case
Study Interview Protocol (CSIP) and the Template Element
Data Gathering Instrument (TEDGI). These are the tools used to gather the
empirical  data,  and  perform  diplomatic  analysis  of 
electronic  records  and  systems  to  create  the  electronic 
record  typology.  The  CSIP  is  the  primary  instrument  to 
gather  the  empirical  data.  The  CSIP  will  then  provide  the 
data  that  researchers  will  need  to  populate  the  Template  for 
Analysis  elements  for  each  case  study.  These  protocol 
instruments  have  been  devised  by  the  Authenticity  Task 
Force to ensure that interviews carried out by the
InterPARES  case  studies  are  conducted  under  comparable 
conditions  at  each  institution.  Currently,  case  studies  are 
being  conducted  with  a  variety  of  institutions  in  Canada, 
the United States, Europe (Italy, United Kingdom, Ireland,
Sweden, France, and the Netherlands), Australia, and Asia
(China  and  Hong  Kong)  as  well  as  a  global  industry  group 
that  includes  CENSA  (the  Collaborative  Electronic 
Notebook  Systems  Association).  Additional  information 
may  also  come  from  supporting  documentation  provided  by 
the  interviewee,  additional  comments  made  by  the 
interviewee,  external  documentation  from  or  about  the  case 
study  system  or  organization  or  other  identifiable  sources. 

To  date,  twelve  case  studies  for  round  1  and  nine  case 
studies  for  round  2  have  been  completed  or  are  underway. 
The  analysis  of  case  studies  focuses  on  the  specific 
characteristics  and  function  of  ensuring  authenticity  in  the 
preservation of digital data and records. Among the case
studies, several share a similar function and purpose but
differ in situational context. For example, six registration
systems are being studied in six different institutions in five different
countries.  There  are  five  student  records  systems  in  five 
universities  in  three  countries.  The  case  studies  with  the 
same  function  are  examined  to  identify  whether  the 
requirements  for  ensuring  authenticity  are  applicable  across 
juridical,  technological,  functional,  and  cultural  contexts. 


V. Implications for Further Research

The  results  of  the  InterPARES  project  will  be  used  as  the 
basis for developing further research on electronic
record-keeping systems. A methodological typology derived from
a variety of case studies in real-life settings will be the
basis  for  developing  further  data  collection  instruments  and 
refining  data  analysis  methods,  which  can  then  be 
applicable  across  electronic  record-keeping  systems.  An 
in-depth  analysis  of  different  communities  of  practice 
would  yield  more  insight  into  the  ways  that  authentic  data 
and  records  can  be  understood,  used  and  managed  and  how 
common  requirements  of  ensuring  authenticity  can  be 
shared  across  jurisdictional  and  technological  boundaries. 
As  a  result  of  the  InterPARES  project  findings,  it  is  hoped 
that standards establishing authenticity of electronic records
will  be  developed  that  will  be  applicable  across  many 
communities  of  practice  now  and  in  the  future. 

Findings  from  each  investigative  domain  are  expected  by 
December  2001.  However,  given  the  depth  of  the  problem 
domains  and  the  ongoing  iterative  process  of  designing, 
testing  and  analyzing  the  various  requirements  and 
methodologies,  it  is  anticipated  that  research  will  continue 
beyond  this  date.  Stay  tuned:  the  results  should  be  quite 
exciting. 

The ARL GIS Literacy Project: Support for
Government Data Services in the Digital Library

Abstract 

This paper describes the ARL GIS Literacy Project and its role in
providing support for continued access to government data which is
increasingly distributed only in digital form. In particular, it will
address the University of Missouri's (MU) experience in the broad
context of the ARL GIS Literacy Project goals as well as in comparison
to the reported experiences of other participating institutions. It will
discuss what MU has produced in terms of GIS services and what has been
learned about broadening awareness of GIS. The MU experience will be
examined as an example of the creation of support mechanisms for
integration of GIS into the digital library environment.

Introduction 

When  geographic  information  systems  (GIS)  moved 
beyond  the  domain  of  professional  geographers  and  into 
"mainstream"  technology  in  the  early  1990s,  libraries 
began  seeking  effective  ways  to  utilize  this  powerful 
research tool. A particular focus has been on the ability of
GIS to deal with government data that is crucial to social
science research (as well as many other disciplines) and
which is increasingly distributed only in digital form. In
the midst of this, the Association of Research Libraries
(ARL) established the GIS Literacy Project as a support
mechanism for libraries interested in learning about and
introducing  GIS  into  their  services. 

As  a  graduate  student  in  the  University  of  Missouri's  (MU) 
library and information science program, I became
intrigued when I learned in a government information
course that GIS was being integrated into library services in
order  to  provide  access  to  digital  spatial  data.  I  was 
particularly  interested  when  I  learned  that  MU  had  been  an 
early  participant  in  the  ARL  project  and  set  out  to  find  out 
more  about  the  topic  through  a  literature  review  of  library 
GIS  services  and  the  ARL  project,  as  well  as  interviews 
with  staff  members  of  MU's  Ellis  library  regarding  the 
institution's  experience  with  the  project  and  the  current 
state  of  the  library's  GIS  services. 

This  paper  is  thus  a  summary  of  my  initial  exploration  into 
the  world  of  library  GIS  services.  It  provides  an  overview 
of  the  ARL  GIS  Literacy  Project  and  addresses  MU's 
experience in the broad context of the
Project goals as well as in comparison to
the reported experiences of other
participating institutions. It also discusses
what MU has produced in terms of GIS
services, what has been learned about
broadening awareness of GIS, and MU's
experiences with creating support
mechanisms for integration of GIS
services into the digital library environment.

ARL  GIS  Literacy  Project 

The  ARL  GIS  Literacy  Project  was  initiated  in  1992  as  a 
multi-phased  project  in  partnership  with  ESRI  (the  leading 
producer of GIS software) and other public and private
partners. The goals of the Project are designed to meet the
current needs of libraries and users while addressing the
changes libraries are undergoing as they enter the 21st
century, and to provide the tools and expertise necessary to
ensure that digital government information can be used
effectively and remain in the public domain. These goals
include: 

• introduction of GIS to a variety of libraries (e.g.,
public, state-based, academic, and university libraries
in public and private institutions) to address diverse
user information needs;

• development of a team of GIS professionals in the
research library community willing to lend time and
expertise to applications, user training, and education
programs;

• encouragement of connections among federal,
state, and local GIS users and information;

• promotion of research, education, and the public
right-to-know through improved access to
government information;

• initiation of library projects to explore new
applications of spatially referenced data and evaluate
the introduction of these services in research
libraries; and

• implementation of programs to allow institutions
that have invested in networking capabilities to
leverage the sharing of resources via networks.

The Project seeks to provide a forum for libraries to
experiment and engage in GIS activities by introducing,
educating, and equipping librarians with the skills needed
to provide access to digital spatial data. In cooperation
with GIS vendors and foundations, ARL organizes training
sessions  for  Project  participants,  sponsors  an  electronic 
mail  list,  and  works  with  government  agencies  on  GIS 
programs  and  related  issues.  Financial  support,  data, 
software,  hardware,  and  expertise  to  assist  in  the  Project 
goals  have  been  provided  by  GIS  vendors  and  foundations. 
The  Project  is  still  active,  although  the  focus  has  shifted 
more  towards  enhancement  of  programs  at  institutions  with 
GIS  services  now  in  place  rather  than  the  earlier  focus  on 
introduction  of  these  services.  Occasional  training  sessions 
are  still  provided,  as  are  other  types  of  support  for 
institutions  wishing  to  develop  GIS  services. 

Participant  Experiences 

In 1997, ARL conducted a survey to determine how, in the
years since the Project began, participants have organized
their delivery of GIS (Davie et al. 1999). The survey
addressed four main categories of GIS service: 1) general
information about the library's role in delivering GIS, 2)
the number, level and academic preparation or other
training of staff involved, 3) the amount and kind of
equipment,  software,  and  data  files  that  support  GIS  in  the 
library,  and  4)  the  kind  of  service  offered  and  by  whom  it  is 
used.  Seventy-two  of  the  121  Project  participants 
responded  to  the  survey.  The  following  summarizes  the 
survey  results: 

General  Information 

•  89%  of  the  responding  institutions  reported  that 
they  provide  GIS  services. 

•  GIS  services  were  administered  by  the  library  at 
83%  of  the  institutions  and  by  academic  departments 
offering  GIS  courses  at  70%  of  the  institutions  (both 
the  library  and  academic  departments  administer  GIS 
services  at  many  institutions). 

•  Only  5%  of  the  libraries  reported  having  discrete 
GIS units; most library GIS services were found to be
located  in  the  government  documents  center  (48%)  or 
map  center  (52%). 

Staffing 

• At 81% of the libraries with GIS services, the
services were directed by a librarian with an MLS;
54%  of  those  librarians  held  at  least  one  additional 
graduate  degree. 

• The most common GIS training for respondents
was  through  ARL's  GIS  Literacy  Project;  others  had 
received  training  through  GIS  software  providers  or 
GIS  coursework. 

Infrastructure

• 78% of the libraries with GIS services utilized
ESRI's ArcView software.

• 58% operated their GIS on Windows 95/NT
platforms, 56% on Windows 3.1; the remainder
operated on DOS, UNIX, and Macintosh platforms.

• 61% utilize computer networks for their GIS
services.

•  The  Government  Printing  Office  depository 
program  provided  digital  data  files  used  for  GIS 
services in 83% of the libraries; 67% supplement
those files through purchases (70% of those libraries
had funding of less than $2000 for such purchases).

Service

• 53% of the responding libraries offered GIS
support service 20 hours a week or less, 24% offered
more, and three institutions offered no support at all.

•  The  typical  number  of  users  of  library  GIS  services 
was  about  seven  per  week;  students  comprised  about 
half  of  those  users,  while  faculty,  staff,  businesses, 
local  government,  and  the  general  public  fairly 
equally  comprised  the  remainder  of  the  users. 

Also in 1997, ARL published Transforming Libraries:
Issues and Innovations in Geographic Information Systems.
This publication presents a number of case reports that
provide an overview of experiences and lessons learned in
attempting  to  develop  support  mechanisms  for  GIS  services 
in  libraries.  Included  is  a  set  of  questions  for  library 
planners  to  answer  in  designing  or  rethinking  GIS-based 
services: 

Key  Questions  for  Planners 

•  What  Kind  of  Service  Should  We  Provide? 

• How Will Collections Be Built?

•  Who  Will  Staff  the  GIS-Based  Services? 

•  How  Will  We  Learn  -  and  Educate  Others  -  about 
GIS? 

•  With  Whom  Will  We  Collaborate? 

•  How  and  Where  Will  We  Store  Data? 

•  What  Will  It  Cost? 

These  questions  point  to  the  critical  themes  that  emerged 
from  the  case  reports  -  themes  such  as  GIS  service 
planning, partnering, policy development, staff training and
expertise, resource allocation, and user support. Following
is a summary of three case reports presented in
Transforming Libraries. These cases are selected for
discussion  because  they  provide  good  examples  of  ways  in 
which  institutions  have  successfully  addressed  particular 
issues  in  their  efforts  to  provide  support  for  GIS  services. 
Specifically,  the  University  of  Georgia  is  noted  for  its 
planning efforts, Penn State for its extensive partnerships,
and  North  Carolina  State  University  for  addressing  issues 
of  staff  training  and  expertise. 

Planning  -  University  of  Georgia  Libraries 

Careful  planning  proved  crucial  in  the  development  of  GIS 
services at the University of Georgia Libraries. In 1994, a
comprehensive survey commissioned by university
administration and issued by a campus-wide committee
identified current and future campus GIS instructional and
research activities, GIS software needs, and a host of
potential  GIS  services.  The  results  of  this  survey,  which 
are available on ARL's Transforming Libraries GIS
Website (http://www.arl.org/transform/gis/) along with the
survey developed by the University of Georgia, allowed the
University's Map Library to create a small GIS Lab and
design focused and responsive services which provide
patron access to the library's digital spatial data as well as
to  spatial  data  available  on  the  Internet. 

Partnerships  -  The  Pennsylvania  State  University 
Libraries 

Penn State University Libraries began to plan for GIS
services in 1995 with a brief analysis of existing resources
that revealed a lack of appropriate coordination of GIS data
and its use. A new mission was drafted to create GIS
services which would provide for acquisition, organization,
and archiving of spatially referenced data and make it
available to the widest number of users through in-house
facilities and the Internet. Partnership development and/or
enhancement of existing relationships with the campus's
geography department, cartography lab, computing center,
and a semi-independent research group proved crucial in
fulfilling this mission. These partnerships resulted in the
creation of an in-house GIS Center at the University
Libraries with trained, continued student staffing provided
by an internship program with the Geography Department
and hardware and software support provided by the campus
computing center. Additionally, partnering with the
campus cartography lab and a semi-independent research
group with GIS expertise resulted in the development of a
Web  interface  which  distributes  Pennsylvania-based  spatial 
information  both  within  and  beyond  campus  via  the 
Internet. 

Training  and  Expertise  -  North  Carolina  State 
University  Libraries 

In the early 1990s, North Carolina State University
organized a small team, consisting of librarians and a
computer center staff member in liaison with a faculty
member with GIS expertise, to initiate the Libraries' GIS
start-up effort. The team quickly identified staff and user
training and expertise as key challenges in providing
support for GIS-based services. To tackle the problem of a
staff with little GIS expertise, members of the team who
acquired training began to offer sessions in basic GIS skills
to other staff to provide them with an understanding of the
scope of the services and enable them to assist in user
support. The GIS team also developed introductory GIS
workshops and classes for campus faculty, staff, and
students which would enable users to work more
independently with the data and software. Additionally, a
position of Librarian for Spatial and Numeric Data Services
was created to provide a professional staff member with the
appropriate experience and proficiencies to carry the
responsibility for development and management of the
libraries' GIS and other spatial and numeric data resources
and services. Providing in-house training and expertise for
GIS services was an important component in the
development of significant and successful GIS services at
North Carolina State.


The  MU  Experience 
Project  Participation 

The  University  of  Missouri-Columbia  was  an  early 
participant  in  the  ARL  GIS  Literacy  Project,  assigning  two 
professional  staff  members  to  participate  in  the  program  in 
addition  to  their  other  duties.  These  staff  members 
attended training sessions and the library's data services
center received a computer and software to begin
experimentation in developing GIS services. Staff
members found it to be an extremely time-consuming
service to offer, and time and staffing constraints have
prevented the development of these services. The data
services center has received only a small number of
requests that utilized the GIS tools, and has not yet been
able to allocate the staff time and other resources to
maintain the requisite hardware and software, develop the
policies and expertise necessary to support GIS services
in-house, and publicize these tools to potential
users. Although development of GIS services remains in
the library's long-term plans, it is not currently
experiencing the demand that would establish development
of these services as a high priority. The staff members
assigned  to  the  ARL  Project  remain  aware  of  GIS  activities, 
but  are  not  currently  actively  participating  in  the  Project. 

Access  to  Spatial  Data 

The University of Missouri-Columbia is a federal
depository library; thus its Government Documents center
receives digital spatial data to which it must provide public
access. The Government Documents center has a
computer that meets the minimum government standards
for running GIS programs, but has only standard printing
capabilities and limited staff expertise and time for
supporting  the  users  who  wish  to  utilize  and  manipulate  the 
data.  The  center  currently  only  receives  about  six  requests 
a  semester  for  GIS-based  data  and  these  are  primarily  from 
experienced  users  to  whom  the  materials  can  be  loaned  or 
who  require  minimal  in-house  support.  The  Government 
Documents  staff  finds  they  are  least  able  to  support  the 
casual  user  who  requires  more  intensive  support  to  work 
with  spatial  data.  The  staff  currently  plans  to  attempt  to 
increase  awareness  of  these  data  resources  in  the  user 
community,  an  effort  which  hasn't  taken  high  priority  in 
the past due to support limitations. The Government
Documents center also has a loose agreement with the
campus's Geographic Resources Center (a unit of the
Department of Geography) to assist with GIS-related needs
that can't be met by the library.

Impact  of  Project  Participation 

In my introduction, I noted that MU's experience in the ARL
Project  would  be  examined  in  this  paper  as  an  example  of 
the  creation  of  support  mechanisms  for  integration  of  GIS 
into the digital library environment. Although MU clearly
has not yet been able to achieve the well-developed
services  reported  in  some  of  the  case  reports,  a  level  of 
support  for  GIS  services  has  resulted  from  participation  in 
the  Project.  GIS  services  have  been  introduced  into  the 
library,  a  level  of  staff  experience  in  using  the  software  and 
equipment  has  been  achieved,  there  has  been  a  broadening 
of  awareness  of  the  data  and  services  that  can  be  provided, 
and  government  distributed  spatial  data  is  accessible.  In 
fact, looking at the status of GIS services at MU in
comparison  to  the  findings  of  ARL's  1997  survey  suggests 
MU's  experience  may  be  typical  of  that  of  many  other 
participating  institutions.  The  following  compares  MU's 
status  with  the  survey  results.  (Note:  MU  did  not 
participate  in  the  1997  survey). 

General Information

•  MU  is  providing  GIS  services,  as  are  89%  of  the 
responding  institutions. 

• As at many of the responding institutions, GIS
services are provided through both an academic
department and the library.

•  Library  GIS  services  are  currently  offered 
primarily  through  the  Government  Documents  center, 
like  those  at  48%  of  the  responding  institutions. 

Staffing 

• MU's library GIS services are supported by
librarians with an MLS, as are those at 81% of the
responding institutions.

• As at the majority of the responding institutions,
library GIS training has been provided primarily
through the ARL Project.

Infrastructure

•  MU's  library  GIS  services  utilize  ESRI's  ArcView 
software,  as  do  78%  of  the  responding  institutions. 

•  MU's  library  GIS  services  are  operated  on  a 
Windows  platform,  as  are  the  majority  of  responding 
institutions. 

•  The  GPO  is  the  primary  source  of  MU's  digital 
spatial  data  files,  as  is  the  case  at  83%  of  the 
responding  institutions. 

Service

• MU offers fewer than 20 hours per week of library
GIS support service, as do many of the responding
institutions.

•  MU  currently  experiences  a  very  low  demand  for 
library  GIS  services  -  only  a  few  requests  per 
semester, as opposed to the seven per week which the
survey  found  typical. 

Future  Possibilities 

The achievements mentioned above provide the
groundwork on which MU can continue to build its GIS
services, particularly if it draws on the experiences and
suggestions offered by other Project participants such as the
three cases previously summarized. Utilizing the University
of Georgia's survey might assist in shaping GIS service
planning efforts. Development of in-house workshops and
expertise, like the North Carolina State example, could
assist in addressing support issues.

Additionally, MU's existing interdisciplinary resources
appear to hold great potential for exploration of
partnerships like those that have been implemented to
support library GIS services at Penn State. MU's
Geographic Resources Center, a multidisciplinary applied
research  and  teaching  facility  for  geographic  and  remote 
sensing  data  analysis,  already  provides  some  support  in 
meeting  GIS-related  requests  received  by  the  library.  The 
Missouri  Spatial  Data  Information  Service,  which  provides 
GIS  and  Census  data  about  Missouri  via  the  Internet,  is  run 
in  close  association  with  the  Geographic  Resources  Center 
and is another rich resource for creating service and
resource partnerships. The MU Integrated Spatial Analysis
of Environmental Systems Mission Enhancement Proposal,
sponsored by the School of Natural Resources, Department
of  Geography,  and  College  of  Engineering,  has  received 
administrative  funding  and  support.  This  proposal  seeks  to 
focus  MU's  efforts  in  the  geographic  information  sciences 
and  to  enable  participation  in  a  global  network  of  research 
and  outreach  in  the  analysis  of  geographic  information 
integrated  across  traditional  disciplinary  boundaries. 
Although this mission enhancement area does not
specifically include development of library GIS services,
library staff do participate in meetings of the mission
enhancement area group, and the proposal and its support
provide groundwork which the library can utilize in its own
proposals to obtain support for GIS services.

Conclusion 

There  is  a  substantial  subset  of  library  literature  which 
focuses  on  GIS  services,  much  of  which  consists  of 
accounts  of  institutions  who  have  participated  in  the  ARL 
GIS Literacy Project. In addition, there are email lists (e.g.,
gis4lib@u.washington.edu), websites
(e.g., www.mcmaster.ca/library/maps/gis_libr.htm), and
conferences (e.g., ESRI's International Conference on GIS
in  Education  and  Libraries)  which  have  focused  on  this 
topic.  Having  reviewed  information  from  a  variety  of  these 
resources,  I've  drawn  the  following  conclusions: 

•    Libraries  seem  to  achieve  great  success  in 
developing their GIS services when they focus on the
particular  issues  or  areas  that  work  best  within  their 
larger  institutional  context  for  creating  support 
mechanisms  for  their  services.  These  issues/areas 
include  planning,  partnering,  policy  development, 
staff  training  and  expertise,  resource  allocation,  and 
user  support,  all  of  which  are  essential  to  creation  of 
support  mechanisms  but,  as  the  case  studies  above 
have illustrated, can have varying roles in developing
support  mechanisms. 

•  The  ARL  Project  has  offered  important  assistance 
in  developing  the  support  mechanisms  which  have 
enabled  the  creation  of  successful  GIS  services  at 
many of the participating institutions; however, there
are many institutions that have yet to be heard from.
Only 72 of the Project's 121 participants responded
to the 1997 survey, and the literature tends to focus on
those institutions that have been able to more fully
develop their services. Gathering a wider range of
case reports from those institutions that may still be
struggling to implement their GIS services will be
essential in continuing to find ways to further
support  mechanisms  for  these  services. 

• A variety of limitations have thus far prevented
MU from achieving the same level of growth in their
GIS services as some other institutions; however,
participation in the Project has provided a forum for
the library to experiment and engage in GIS activities
and broaden awareness of the potential this tool may
hold for the library's long-term plans to provide
access  to  and  support  for  digital  data  resources. 


Mount, Jack D., Robert Change, and Patricia J. Morris.
"Planning and Developing a GIS Program in a Large
Academic Library." (1998 ESRI International User
Conference Proceedings) [www.esri.com/library/userconf/
proc98/PROCEED/T6600/PAP551/P551.HTM]

Soete, George J. Transforming Libraries 2: Issues and
Innovations in Geographic Information Systems.
(Washington: Association of Research Libraries, SPEC Kit
219, February 1997). [www.arl.org/transform/gis/
gistrans.html]

University of Missouri-Columbia. Office of the Provost.
"Mission Enhancement - Global Information (MOGAIA)
Funded Proposals. Global Access to the Information Age:
Integrated Spatial Analysis of Environmental Systems"
[web.missouri.edu/~provost/ME-
MOGAIA_Abstracts.html]

* Paper presented at IASSIST 2000 (Chicago, 7-10 June
2000). Mary French, School of Information Science &
Learning Technologies, University of Missouri-Columbia.